@adamperlin
Contributor

This PR adds support for generating an import section to the WasmObjectWriter. We currently import:

  • __stack_pointer
  • __r2r_start

Various data segments are then placed into __r2r_start + <segment_offset>. We use the constant expressions extension for these address calculations.

Lay out R2R data segments relative to imported __r2r_start global
@adamperlin adamperlin requested a review from kg January 28, 2026 23:53
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 28, 2026
@kg
Member

kg commented Jan 28, 2026

I will try and update the test harness to be compatible with this

@dotnet-policy-service
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

Copilot AI review requested due to automatic review settings January 29, 2026 23:38
@adamperlin adamperlin marked this pull request as ready for review January 29, 2026 23:42
@adamperlin
Contributor Author

@kg I'd love a review on this when you have a chance!

Contributor

Copilot AI left a comment

Pull request overview

Adds initial support for emitting a Wasm import section from WasmObjectWriter, and switches R2R data segment placement to use constant expressions based on an imported __r2r_start symbol.

Changes:

  • Introduces a new wasm.import object node section and emits it as Wasm section type Import.
  • Reworks combined data segment placement to use an instruction expression (global.get __r2r_start + i32.const offset + i32.add) instead of a fixed DataStartOffset.
  • Adds a small Wasm encoding model (imports, globals, memory type, expression/instruction encoding) and a helper to write UTF-8 strings with a length prefix.
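
A minimal sketch of how an offset expression of this shape might be encoded to bytes, assuming the standard Wasm opcodes global.get (0x23), i32.const (0x41), i32.add (0x6A) and the expression terminator end (0x0B). The class and method names are hypothetical and are not the PR's WasmInstructionGroup API. Using i32.add in a data segment offset relies on the extended constant expressions proposal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the PR: encodes
//   global.get <globalIndex>; i32.const <offset>; i32.add; end
static class ConstExprSketch
{
    const byte GlobalGet = 0x23;
    const byte I32Const = 0x41;
    const byte I32Add = 0x6A;
    const byte End = 0x0B;

    public static byte[] EncodeBasePlusOffset(uint globalIndex, int offset)
    {
        var bytes = new List<byte>();
        bytes.Add(GlobalGet);
        WriteULEB128(bytes, globalIndex); // global indices are unsigned LEB128
        bytes.Add(I32Const);
        WriteSLEB128(bytes, offset);      // i32.const immediates are signed LEB128
        bytes.Add(I32Add);
        bytes.Add(End);
        return bytes.ToArray();
    }

    static void WriteULEB128(List<byte> output, uint value)
    {
        do
        {
            byte b = (byte)(value & 0x7F);
            value >>= 7;
            if (value != 0)
                b |= 0x80; // more bytes follow
            output.Add(b);
        } while (value != 0);
    }

    static void WriteSLEB128(List<byte> output, int value)
    {
        bool more = true;
        while (more)
        {
            byte b = (byte)(value & 0x7F);
            value >>= 7; // arithmetic shift keeps the sign
            bool signBit = (b & 0x40) != 0;
            if ((value == 0 && !signBit) || (value == -1 && signBit))
                more = false;
            else
                b |= 0x80;
            output.Add(b);
        }
    }
}
```

For example, ConstExprSketch.EncodeBasePlusOffset(1, 16) yields the bytes 23 01 41 10 6A 0B.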

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File | Description
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmObjectWriter.cs | Adds import section emission and uses __r2r_start + offset constant expressions for data segment placement.
src/coreclr/tools/Common/Compiler/ObjectWriter/WasmNative.cs | Adds encodable abstractions/types for Wasm expressions and import encodings (memory/global types, etc.).
src/coreclr/tools/Common/Compiler/ObjectWriter/SectionWriter.cs | Adds WriteUtf8WithLength helper for Wasm-style length-prefixed strings.

Comment on lines +224 to +230
class WasmInstructionGroup : IWasmEncodable
{
    readonly WasmExpr[] WasmExprs;
    public WasmInstructionGroup(WasmExpr[] wasmExprs)
    {
        WasmExprs = wasmExprs;
    }

Copilot AI Jan 29, 2026

WasmInstructionGroup uses a field named WasmExprs without the underscore prefix used elsewhere in this file (e.g., _types, _params, _returns). Rename to _wasmExprs (and update usages) to match the established local naming convention and improve readability.

Comment on lines +402 to +427
    public WasmMemoryType(WasmLimitType limitType, uint min, uint? max = null)
    {
        if (LimitType == WasmLimitType.HasMinAndMax && !Max.HasValue)
        {
            throw new ArgumentException("Max must be provided when LimitType is HasMinAndMax");
        }

        LimitType = limitType;
        Min = min;
        Max = max;
    }

    public override int Encode(Span<byte> buffer)
    {
        int pos = 0;
        buffer[pos++] = (byte)LimitType;
        pos += DwarfHelper.WriteULEB128(buffer.Slice(pos), Min);
        if (LimitType == WasmLimitType.HasMinAndMax)
        {
            pos += DwarfHelper.WriteULEB128(buffer.Slice(pos), Max!.Value);
        }
        return pos;
    }

    public override int EncodeSize() => 2;
}

Copilot AI Jan 29, 2026

WasmMemoryType has two issues:

  1. The constructor validates LimitType/Max fields before they’re assigned, so the check never does what it intends. It should validate the limitType/max parameters.
  2. EncodeSize() returns a constant 2, but Encode(...) writes a variable-length payload (1 byte + ULEB128(min) [+ ULEB128(max)]). This will cause callers using EncodeSize() (e.g., import encoding) to allocate the wrong buffer size and can lead to truncated writes/corruption. Compute the size using DwarfHelper.SizeOfULEB128 for min/max and include the leading limit-type byte.
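
A minimal, self-contained sketch of what both fixes could look like. The type names (LimitKind, MemoryTypeSketch) and the local LEB128 helpers are hypothetical stand-ins for the PR's WasmLimitType, WasmMemoryType and DwarfHelper, so this is an illustration under those assumptions rather than the actual change.

```csharp
using System;

// Hypothetical stand-in for WasmLimitType; 0x00 and 0x01 are the Wasm limits flags.
enum LimitKind : byte { HasMin = 0x00, HasMinAndMax = 0x01 }

// Hypothetical stand-in for WasmMemoryType.
sealed class MemoryTypeSketch
{
    public LimitKind Kind { get; }
    public uint Min { get; }
    public uint? Max { get; }

    public MemoryTypeSketch(LimitKind kind, uint min, uint? max = null)
    {
        // Validate the parameters; the properties are not assigned yet.
        if (kind == LimitKind.HasMinAndMax && !max.HasValue)
            throw new ArgumentException("max must be provided when kind is HasMinAndMax", nameof(max));

        Kind = kind;
        Min = min;
        Max = max;
    }

    // Size mirrors Encode: 1 flag byte + ULEB128(min) [+ ULEB128(max)].
    public int EncodeSize()
    {
        int size = 1 + SizeOfULEB128(Min);
        if (Kind == LimitKind.HasMinAndMax)
            size += SizeOfULEB128(Max.Value);
        return size;
    }

    public int Encode(Span<byte> buffer)
    {
        int pos = 0;
        buffer[pos++] = (byte)Kind;
        pos += WriteULEB128(buffer.Slice(pos), Min);
        if (Kind == LimitKind.HasMinAndMax)
            pos += WriteULEB128(buffer.Slice(pos), Max.Value);
        return pos;
    }

    static int SizeOfULEB128(uint value)
    {
        int size = 1;
        while ((value >>= 7) != 0)
            size++;
        return size;
    }

    static int WriteULEB128(Span<byte> buffer, uint value)
    {
        int pos = 0;
        do
        {
            byte b = (byte)(value & 0x7F);
            value >>= 7;
            if (value != 0)
                b |= 0x80; // more bytes follow
            buffer[pos++] = b;
        } while (value != 0);
        return pos;
    }
}
```

With this shape, a memory type with min 128 and max 65536 reports a size of 6 bytes (1 + 2 + 3), and Encode returns the same count.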

Comment on lines +113 to +115
int size = import.EncodeSize();
import.Encode(writer.Buffer.GetSpan(size));
writer.Buffer.Advance((int)size);

Copilot AI Jan 29, 2026

WriteImport advances the section buffer by EncodeSize() rather than by the actual byte count returned from Encode(...). If EncodeSize() and Encode() ever diverge (or a future implementation writes fewer bytes), this will either leave garbage bytes in the section or corrupt subsequent writes. Capture the return value from Encode(...), assert it matches the expected size, and advance by the encoded byte count.

Suggested change
- int size = import.EncodeSize();
- import.Encode(writer.Buffer.GetSpan(size));
- writer.Buffer.Advance((int)size);
+ int expectedSize = import.EncodeSize();
+ Span<byte> buffer = writer.Buffer.GetSpan(expectedSize);
+ int encodedSize = import.Encode(buffer);
+ Debug.Assert(encodedSize == expectedSize, $"Encoded import size mismatch in {nameof(WriteImport)}.");
+ writer.Buffer.Advance(encodedSize);

Comment on lines +187 to +193
WasmInstructionGroup GetR2StartOffset(int offset)
{
    return new WasmInstructionGroup([
        Global.Get(1), // __r2r_start
        I32.Const(offset),
        I32.Add,
    ]);

Copilot AI Jan 29, 2026

GetR2StartOffset hard-codes Global.Get(1) for __r2r_start. This couples data segment placement to the current import/global ordering and will silently break if another imported/defined global is added earlier. Consider tracking the global index for __r2r_start when writing imports (e.g., via constants/enums or by computing the index from the import list) and use that value here.
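
One way the index could be computed from the import list, as a rough sketch. In Wasm, imported globals occupy the start of the global index space in import order, and imports of other kinds (such as the memory) do not consume a global index. The ImportKind and ImportEntry types, and the "env" module name on the globals, are assumptions made for illustration and do not come from the PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the PR's import model.
enum ImportKind { Function, Table, Memory, Global }

readonly struct ImportEntry
{
    public readonly string Module;
    public readonly string Name;
    public readonly ImportKind Kind;

    public ImportEntry(string module, string name, ImportKind kind)
    {
        Module = module;
        Name = name;
        Kind = kind;
    }
}

static class GlobalIndexSketch
{
    // Returns the global index of the named imported global by counting
    // only the global imports that precede it.
    public static uint FindImportedGlobalIndex(IReadOnlyList<ImportEntry> imports, string name)
    {
        uint index = 0;
        foreach (ImportEntry import in imports)
        {
            if (import.Kind != ImportKind.Global)
                continue; // memories, tables and functions have their own index spaces
            if (import.Name == name)
                return index;
            index++;
        }
        throw new InvalidOperationException($"No imported global named '{name}'.");
    }
}
```

With an import list of memory, __stack_pointer, __r2r_start (in that order), looking up "__r2r_start" returns 1, which matches the currently hard-coded value while staying correct if another global import is added earlier.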

ulong numPages = (contentSize + (1<<16) - 1) >> 16;

_defaultImports[0] = new WasmImport("env", "memory", WasmExternalKind.Memory,
new WasmMemoryType(0x00, (uint)numPages)); // memory limits: flags (0 = only minimum)

Copilot AI Jan 29, 2026

new WasmMemoryType(0x00, (uint)numPages) passes the integer literal 0x00 where WasmMemoryType expects a WasmLimitType. This compiles only because C# implicitly converts a constant zero to any enum type, and it hides the intent. Use WasmLimitType.HasMin for the first argument.

Suggested change
- new WasmMemoryType(0x00, (uint)numPages)); // memory limits: flags (0 = only minimum)
+ new WasmMemoryType(WasmLimitType.HasMin, (uint)numPages)); // memory limits: flags (HasMin = only minimum)

Comment on lines +330 to +333
// Calculate the required memory size based on the combined data section size
ulong contentSize = (ulong)SectionByName(WasmObjectNodeSection.CombinedDataSection.Name).ContentSize;
ulong numPages = (contentSize + (1<<16) - 1) >> 16;


Copilot AI Jan 29, 2026

numPages is computed only from the combined data section’s encoded size. This can (a) produce a 0-page minimum when there are no data segments, and (b) ignores any non-zero base placement such as the previous DataStartOffset (and now potentially __r2r_start). Since the import’s minimum is the module’s only enforcement on the host-provided memory size, consider clamping to at least 1 page and/or adding a fixed base reservation (stack / __r2r_start placement) so the requested minimum can’t be smaller than the runtime’s expected address range.
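
A small sketch of the clamping idea, assuming 64 KiB Wasm pages. The baseReservation parameter is hypothetical: the PR does not state how much space precedes the data (stack, __r2r_start placement), so the caller would have to supply a value that matches the runtime's layout.

```csharp
using System;

// Hypothetical helper, not part of the PR.
static class MemoryPagesSketch
{
    const int PageShift = 16;                 // Wasm pages are 64 KiB
    const ulong PageSize = 1UL << PageShift;

    // Rounds (contentSize + baseReservation) up to whole pages and never
    // returns fewer than one page.
    public static uint MinimumPages(ulong contentSize, ulong baseReservation = 0)
    {
        checked
        {
            ulong total = contentSize + baseReservation;
            ulong pages = (total + PageSize - 1) >> PageShift;
            if (pages == 0)
                pages = 1; // avoid a 0-page minimum when there is no data
            return (uint)pages;
        }
    }
}
```

Under these assumptions, MinimumPages(0) returns 1, MinimumPages(65537) returns 2, and MinimumPages(1, 65536) returns 2 because the reserved base pushes the data into a second page.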

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

}

// Simple DSL wrapper for creating Wasm expressions
static class Global
Member

Please put the DSL in a namespace of its own.

- private WasmDataSection CreateCombinedDataSection(int dataStartOffset)
+ private WasmDataSection CreateCombinedDataSection()
{
WasmInstructionGroup GetR2StartOffset(int offset)
Member

Should it be GetR2RStartOffset?

@kg
Member

kg commented Jan 30, 2026

This looks fairly good. All of Copilot's feedback seemed accurate, though, so we need to address it.
The Encode/EncodeSize split is a little awkward, but I don't see an obvious way to get rid of it other than doing something weird with empty spans.


Labels

arch-wasm WebAssembly architecture area-crossgen2-coreclr

3 participants